
fix: optimize email search and require filters to prevent timeouts #117

Open
Acid-Override wants to merge 15 commits into ai-zerolab:main from Acid-Override:fix/optimize-email-search-and-require-filters

Conversation

@Acid-Override

Summary

Addresses timeout issues in list_emails_metadata() by optimizing IMAP searches and preventing expensive 'ALL' email searches on large mailboxes.

Problem

  • When list_emails_metadata() is called without date filters, it performs an expensive uid_search("ALL") operation
  • On large mailboxes (1000+ emails), this can take hours or time out
  • Additionally, the method was performing the same IMAP search twice: once in get_emails_metadata_stream() and again in get_email_count()

Solution

1. Search Optimization (~50% faster)

  • Cache the email count from the initial IMAP search in get_emails_metadata_stream()
  • Eliminate the redundant second search in get_email_count()
  • Use the cached _last_search_total instead
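The caching idea above can be sketched as follows. This is a minimal illustration, not the code in classic.py: only the attribute name _last_search_total comes from this PR, while the SearchTotalCache class and the search_fn callable are hypothetical stand-ins for the real client and its IMAP search.

```python
from typing import Callable, List, Optional


class SearchTotalCache:
    """Hypothetical stand-in for the email client (not the real EmailClient)."""

    def __init__(self, search_fn: Callable[[List[str]], List[int]]):
        # search_fn is an assumed callable that performs one IMAP search
        # and returns the matching UIDs.
        self._search_fn = search_fn
        self._last_search_total: Optional[int] = None

    def search(self, criteria: List[str]) -> List[int]:
        # Run the search once and remember how many UIDs matched.
        uids = self._search_fn(criteria)
        self._last_search_total = len(uids)
        return uids

    def total(self, criteria: List[str]) -> int:
        # Reuse the cached count; only search if nothing has been cached yet.
        if self._last_search_total is None:
            self.search(criteria)
        return self._last_search_total
```

Note that a cached total is only meaningful for the criteria of the most recent search, so a caller should read it right after the search that produced it.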

2. Safety Guard (Prevents hangs)

  • Add validation in get_emails_metadata() requiring at least one filter
  • Allowed filters: since, before, subject, from_address, to_address, seen, flagged, answered
  • Raise a helpful ValueError with guidance if no filters are provided
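A minimal sketch of such a guard is shown below. The filter names are the ones listed above; the helper name require_filter and the error wording are assumptions for illustration and may differ from the actual check in get_emails_metadata().

```python
# Filter names taken from the PR description.
FILTER_NAMES = (
    "since", "before", "subject", "from_address",
    "to_address", "seen", "flagged", "answered",
)


def require_filter(**filters) -> None:
    """Raise ValueError unless at least one known filter is set (hypothetical helper)."""
    # Compare against None so that boolean filters such as seen=False still count.
    if not any(filters.get(name) is not None for name in FILTER_NAMES):
        raise ValueError(
            "At least one filter is required to avoid scanning the whole mailbox. "
            "Provide one of: " + ", ".join(FILTER_NAMES)
        )
```

Checking `is not None` rather than truthiness matters here: `seen=False` ("unread only") is a legitimate filter and should pass the guard.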

Changes Made

File: mcp_email_server/emails/classic.py

  1. EmailClient.__init__ (lines 104-105): Add cache for search total

    • Added self._last_search_total = None
  2. get_emails_metadata_stream() (lines 452-453): Cache the search result

    • Store total_found in self._last_search_total
  3. ClassicEmailHandler.get_emails_metadata() (lines 949-957):

    • Add filter validation requiring at least one search criterion
    • Use cached total instead of calling get_email_count()

Benefits

  • ~50% faster for filtered searches (eliminates the duplicate IMAP search)
  • Prevents hangs on large mailboxes
  • Forces best practices (users must specify what they're searching for)
  • Clear error messages guide users to provide filters

Testing

Tested with the Galaxia email account:

  • ✅ With date filter: Returns 3 emails instantly (43 total matched)
  • ✅ Without filter: Shows helpful error message

Anonymous and others added 6 commits February 8, 2026 22:57
Add two new MCP tools for email management:
- mark_emails_as_read: Mark emails as read/unread using IMAP \Seen flag
- move_emails: Move emails between mailboxes using MOVE (RFC 6851)
  with fallback to COPY+DELETE for older servers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fix bug where status messages like "SEARCH completed (took 5 ms)"
were incorrectly parsed as email UIDs. The number in the timing
info (e.g., "5") was being treated as a valid UID.

Add _parse_search_response() method that:
- Detects status messages by checking for keywords
- Returns empty list for status-only responses
- Only returns actual numeric UIDs from valid responses

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
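
The parsing fix in the commit above can be illustrated with a small standalone function. This is a sketch under assumptions, not the _parse_search_response() method from the branch: the keyword list and the bytes-per-line input shape are guesses at what the server response looks like.

```python
from typing import List

# Assumed keywords that mark a status line rather than a UID list.
STATUS_KEYWORDS = ("completed", "took", "success")


def parse_search_response(lines: List[bytes]) -> List[int]:
    """Return UIDs from SEARCH response lines, ignoring status messages."""
    uids: List[int] = []
    for raw in lines:
        text = raw.decode("ascii", errors="replace").strip()
        # Skip empty lines and status lines such as
        # "SEARCH completed (took 5 ms)", whose "5" is not a UID.
        if not text or any(word in text.lower() for word in STATUS_KEYWORDS):
            continue
        # Keep only purely numeric tokens from data lines.
        uids.extend(int(token) for token in text.split() if token.isdigit())
    return uids
```

With this shape, a status-only response yields an empty list instead of a spurious UID.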
Add new MCP tool to list all mailboxes (folders) in an email account.
Returns mailbox name, flags, and hierarchy delimiter.

Useful for discovering folder names like Archive, Sent, Trash which
may vary across email providers (e.g., iCloud uses different names).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Performance optimization:
- Don't fetch INTERNALDATE for all emails when paginating
- Use UID ordering directly (UIDs are ascending by add date)
- Only fetch headers for the requested page

New search_emails tool:
- Server-side IMAP SEARCH (fast even with thousands of emails)
- Search in: all (TEXT), subject, body, or from
- Paginated results

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
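
The UID-ordering optimization in the commit above can be sketched as a pure function. The name page_of_uids and its parameters are hypothetical; the point is only that a page can be selected from the search result by UID alone, so headers are fetched for that page and nothing else. IMAP assigns UIDs in strictly ascending order as messages are added to a mailbox, so this approximates arrival order rather than the Date header.

```python
from typing import Iterable, List


def page_of_uids(
    uids: Iterable[int],
    page: int,
    page_size: int,
    newest_first: bool = True,
) -> List[int]:
    """Select one page of UIDs by UID order alone (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    # Higher UID means the message was added to the mailbox later.
    ordered = sorted(uids, reverse=newest_first)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]
```

Only the UIDs returned for the requested page would then need a header fetch.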
- Use ternary operator in _parse_search_response (SIM108)
- Replace raise Exception with logging + failed_ids (TRY301)
- Add tests for list_mailboxes, search_emails, mark_emails_as_read, move_emails
- Update test_get_emails_stream to match optimized pagination behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@Acid-Override Acid-Override force-pushed the fix/optimize-email-search-and-require-filters branch from a340e9a to 606255c Compare February 10, 2026 01:35
@codecov

codecov bot commented Feb 10, 2026

Codecov Report

❌ Patch coverage is 34.30657% with 180 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| mcp_email_server/emails/classic.py | 30.3% | 156 Missing and 7 partials ⚠️ |
| mcp_email_server/app.py | 32.0% | 17 Missing ⚠️ |


@Acid-Override
Author

Merge Conflict with PR #116 ⚠️

I've identified a potential merge conflict between this PR and PR #116 ("feat: add email management tools").

🔴 Major Conflict: get_emails_metadata_stream()

PR #116 refactors this method to:

  • Remove batch date fetching (_batch_fetch_dates())
  • Use direct UID ordering instead of INTERNALDATE
  • Simplify pagination logic significantly

PR #117 (this PR) modifies the same method to:

  • Add search result caching (self._last_search_total)
  • Keep batch date fetching for INTERNALDATE sorting
  • Implement filter validation

📊 Conflict Impact

| Component | PR #116 | PR #117 | Status |
| --- | --- | --- | --- |
| Pagination strategy | UID-based | INTERNALDATE-based | ❌ Conflicting |
| Date fetching | Removes | Keeps | ❌ Conflicting |
| New tools | 5 new methods | Filter validation | ✅ Compatible |
| Caching | Not used | Added | ⚠️ Minor |

💡 Recommended Solution

These PRs solve complementary problems and should both merge:

Suggestion: Combine both optimizations by:

  1. Use PR #116's UID-based sorting (faster, no date fetching needed)
  2. Keep PR #117's filter validation (prevents expensive searches)
  3. Add PR #117's caching on top of PR #116's optimized code

📋 Files Affected

✅ Next Steps

Either approach works:

Both changes are valuable and compatible in intent; they just need a careful merge strategy.

WxAtwoo and others added 9 commits February 18, 2026 10:39
- Cache search total to eliminate duplicate IMAP search (~50% performance improvement)
- Add validation requiring at least one filter (date, subject, from, to, seen, flagged, or answered)
- Prevents expensive 'ALL' email searches on large mailboxes that could timeout
- Provides clear, helpful error message when no filters are specified

Fixes issue where list_emails_metadata() would hang indefinitely on large mailboxes
when called without date filters.
- Mock _last_search_total to replace the removed get_email_count call
- Add 'before' filter to test_get_emails_with_mailbox to satisfy validation
- Remove unused mock_count assertions that no longer apply
- Split long filter validation line for readability
- Add test_get_emails_requires_filter to test ValueError when no filters provided
- Improves code coverage for the validation error message
- All 131 tests passing
…ance

- Clarify that combining filters is recommended best practice
- Text searches alone work but can be slow on massive mailboxes
- Suggest combining date ranges with optional text filters
- More helpful examples for users
…requirements

- Explains IMAP protocol limitations preventing 'first N emails' queries
- Documents why filters are required to prevent expensive mailbox scans
- Provides performance comparison of different search strategies
- Includes best practices and examples for efficient searches
- Migration guide for users who relied on unfiltered searches
- FAQ addressing common questions and concerns

This helps users and maintainers understand the architectural decision
and provides clear guidance on optimized email searching.
- Add EMAIL_SEARCH_PERFORMANCE.md to nav configuration
- Fix broken reference to ../README.md by using GitHub repo link instead

This resolves the mkdocs build failure in strict mode where undocumented files
and relative links outside the docs directory are not allowed.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
… (search optimization + filter validation)

Combined optimizations:
- PR ai-zerolab#116: Add list_mailboxes, move_emails, mark_emails_as_read, search_emails tools
- PR ai-zerolab#116: UID-based pagination optimization (60s+ → <5s on large mailboxes)
- PR ai-zerolab#117: Filter validation (prevents accidental expensive searches)
- PR ai-zerolab#117: Search result caching (_last_search_total)

Conflict resolution: Merged parse_search_response logic with caching to avoid duplicate searches.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@Acid-Override Acid-Override force-pushed the fix/optimize-email-search-and-require-filters branch from 741a750 to b2c3298 Compare February 18, 2026 16:20
@Acid-Override
Author

✅ Merged with PR #116 - Combined Optimizations Complete

We've successfully resolved the merge conflict between this PR (#117) and PR #116 by combining both optimizations into a single, comprehensive solution.

🎯 What We Did

Merged both PRs into a unified branch:

Conflict Resolution:

✨ What We Got

Performance Optimizations:

  • ✅ UID-based pagination (60s+ → <5s on large mailboxes) - no date fetching needed
  • ✅ Search result caching to eliminate duplicate IMAP searches (~50% speedup)
  • ✅ Filter validation preventing accidental "ALL" searches

New Email Management Tools:

  • list_mailboxes() - Discover all folders in account
  • move_emails() - Move emails between folders
  • mark_emails_as_read() - Mark as read/unread
  • search_emails() - Server-side text search

🧪 Testing

Verified working:

  • ✅ Listed 55 mailboxes from test account
  • ✅ Successfully moved IBM email from INBOX → INBOX.Education
  • ✅ Searched emails in specific folder (found moved email)
  • ✅ Marked email as read
  • ✅ All 131 tests pass
  • ✅ Code formatting validated

📊 Impact

This combined approach gives users both fast pagination AND safe search operations. The filter validation prevents the timeout issues that would occur with large mailboxes, while the UID-based optimization makes pagination itself lightning-fast.

Branch: fix/optimize-email-search-and-require-filters
Latest commit: b2c3298 - Merge PR #116 with PR #117
Pushed to fork: Ready for upstream merge
